Fix binary error truncation; add live pipeline progress from stderr #8
Merged
Co-authored-by: lmangani <1423657+lmangani@users.noreply.github.com>
Copilot (AI) changed the title from "[WIP] Fix error in dit-vae model initialization" to "Fix dit-vae Metal tensor API crash on Apple Silicon M1–M4" on Mar 7, 2026
Co-authored-by: lmangani <1423657+lmangani@users.noreply.github.com>
Copilot (AI) changed the title from "Fix dit-vae Metal tensor API crash on Apple Silicon M1–M4" to "Fix binary error truncation; add live pipeline progress from stderr" on Mar 7, 2026
The original failure was invisible: the 500-char truncation was consumed entirely by ggml-metal's informational init log (`tensor API disabled for pre-M5 and pre-A19 devices`, a normal log line, not an error), leaving the actual crash reason cut off. A previous fix incorrectly treated this message as a GPU failure and added a CPU-only retry; that assumption was wrong and the retry has been removed.

Changes

`server/src/services/acestep.ts`

- Removed the CPU-only retry (`-ngl 0` fallback): the ggml-metal init message is informational; the GPU works correctly on M1–M4.
- `runBinary` gains an `onLine` callback: stderr is split into lines and streamed to the caller in real time; full stderr is still accumulated for error reporting.
- `makeLmProgressHandler` parses ace-qwen3 stderr into `job.stage` / `job.progress` (0–50%):
  - `[Phase1] step N … tok/s` → 0–28%
  - `[Phase1] Decode` → 30%
  - `[Decode] step N … total codes … tok/s` → 30–50% (budget from `[Phase2] max_tokens`)
- `makeDitVaeProgressHandler` parses dit-vae stderr into `job.stage` / `job.progress` (50–100%):
  - `[DiT] step N/M` → 50–85%
  - `[DiT] Total generation` → 85%
  - `[VAE] Tiled decode N tiles` / `Tiled decode done` → 85–98%
- Named constants (`PROGRESS_LM_PHASE1_MAX`, `PROGRESS_LM_PHASE2_END`, `PROGRESS_DIT_END`, `PROGRESS_VAE_END`) make the 0–100 allocation explicit.

The frontend's existing job-status polling picks up `stage` and `progress` automatically; no frontend changes are needed.

`.env.example`

Removed incorrect Apple Silicon Metal retry guidance; kept `DIT_VAE_EXTRA_ARGS` / `ACE_QWEN3_EXTRA_ARGS` with a corrected general-purpose description.
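For reviewers who want a concrete picture of the `onLine` behaviour described for `runBinary`, here is a minimal sketch of line-splitting a chunked stderr stream while still accumulating the full text. It is not the code from `acestep.ts`: `makeLineSplitter`, `LineSplitter`, and their method names are hypothetical and chosen only for illustration.

```typescript
// Hypothetical helper (not from acestep.ts): splits chunked stderr into
// complete lines for a callback, while keeping the full text for error reports.
type LineCallback = (line: string) => void;

interface LineSplitter {
  push(chunk: string): void; // feed one raw chunk as it arrives
  flush(): void;             // emit a trailing partial line, if any
  full(): string;            // everything received so far
}

function makeLineSplitter(onLine: LineCallback): LineSplitter {
  let pending = ""; // text after the last newline seen so far
  let all = "";     // full accumulated stderr

  return {
    push(chunk: string): void {
      all += chunk;
      pending += chunk;
      // A chunk may end mid-line, so the last piece is held back until
      // its newline arrives in a later chunk.
      const parts = pending.split(/\r?\n/);
      pending = parts.pop() ?? "";
      for (const line of parts) onLine(line);
    },
    flush(): void {
      if (pending !== "") {
        onLine(pending);
        pending = "";
      }
    },
    full: (): string => all,
  };
}
```

With Node's `child_process`, such a splitter could be fed from `child.stderr.on("data", (buf) => splitter.push(buf.toString()))` and flushed in the `close` handler. Holding back the partial tail is what keeps a progress line from being parsed in two halves.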
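The dit-vae progress mapping listed above can be illustrated with a short sketch. Everything here is assumed rather than copied from the PR: the constant values are inferred from the stated ranges (50, 85, 98), the `Job` shape and stage names are hypothetical, and the regular expression assumes the log line literally contains `[DiT] step N/M`.

```typescript
// Assumed values, inferred from the ranges in the PR description.
const PROGRESS_LM_PHASE2_END = 50; // dit-vae starts where ace-qwen3 ends
const PROGRESS_DIT_END = 85;
const PROGRESS_VAE_END = 98;

// Hypothetical minimal job shape; the real job object likely has more fields.
interface Job {
  stage: string;
  progress: number;
}

// Sketch of a per-line handler; the name is illustrative only.
function sketchDitVaeProgressHandler(job: Job): (line: string) => void {
  // Progress only moves forward, so unrelated or repeated lines cannot lower it.
  const raise = (value: number): void => {
    job.progress = Math.max(job.progress, value);
  };

  return (line: string): void => {
    const step = /\[DiT\] step (\d+)\/(\d+)/.exec(line);
    if (step) {
      const n = Number(step[1]);
      const m = Number(step[2]);
      if (m > 0) {
        // Linear interpolation of step N of M into the 50-85 band.
        const frac = Math.min(n / m, 1);
        const span = PROGRESS_DIT_END - PROGRESS_LM_PHASE2_END;
        job.stage = "dit"; // hypothetical stage label
        raise(Math.round(PROGRESS_LM_PHASE2_END + frac * span));
      }
      return;
    }
    if (line.includes("[DiT] Total generation")) {
      job.stage = "vae"; // hypothetical stage label
      raise(PROGRESS_DIT_END);
      return;
    }
    if (line.includes("Tiled decode done")) {
      raise(PROGRESS_VAE_END);
    }
    // Any other line (e.g. the ggml-metal init log) is ignored.
  };
}
```

Lines that match none of the patterns leave the job untouched, which is consistent with treating the ggml-metal `tensor API disabled` message as informational rather than as a failure signal.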